Abstract

We show how solution concepts in games such as Nash equilibrium,
correlated equilibrium, rationalizability, and sequential equilibrium
can be given a uniform definition in terms of knowledge-based programs.
Intuitively, all solution concepts are implementations of two
knowledge-based programs, one appropriate for games represented in
normal form, the other for games represented in extensive form. These
knowledge-based programs can be viewed as embodying rationality. The
representation works even if (a) information sets do not capture an
agent’s knowledge, (b) uncertainty is not represented by probability,
or (c) the underlying game is not common knowledge.

1  Introduction 

Game theorists represent games in two standard ways: in normal form,
where each agent simply chooses a strategy, and in extensive form,
using game trees, where the agents make choices over time. An
extensive-form representation has the advantage that it describes the
dynamic structure of the game: it explicitly represents the sequence of
decision problems encountered by agents. However, the extensive-form
representation purports to do more than just describe the structure of
the game; it also attempts to represent the information that players
have in the game, by the use of information sets. Intuitively, an
information set consists of a set of nodes in the game tree where a
player has the same information. However, as Halpern [1997] has pointed
out, information sets may not adequately represent a player’s
information.

Halpern makes this point by considering the following single-agent game
of imperfect recall, originally presented by Piccione and Rubinstein
[1997]: The game starts with nature moving either left or right, each
with probability 1/2. The agent can then either stop the game (playing
move S) and get a payoff of 2, or continue, by playing move B. If he
continues, he gets a high payoff if he matches nature’s move, and a low
payoff otherwise. Although he originally knows nature’s move, the
information set that includes the nodes labeled $x_3$ and $x_4$ is
intended to indicate that the player forgets whether nature moved left
or right after moving B. Intuitively, when he is at the information set
$X$, the agent is not supposed to know whether he is at $x_3$ or at
$x_4$.

Figure 1: A game of imperfect recall.

It is not hard to show that the strategy that maximizes expected
utility chooses move S at node $x_1$, move B at node $x_2$, and move R
at the information set $X$ consisting of $x_3$ and $x_4$. Call this
strategy $f$. Let $f'$ be the strategy of choosing move B at $x_1$,
move S at $x_2$, and move L at $X$. Piccione and Rubinstein argue that
if node $x_1$ is reached, the player should reconsider, and decide to
switch from $f$ to $f'$. As Halpern points out, this is indeed true,
provided that the player knows at each stage of the game what strategy
he is currently using. However, in that case, if the player is using
$f$ at the information set, then he knows that he is at node $x_4$; if
he has switched and is using $f'$, then he knows that he is at $x_3$.
So, in this setting, it is no longer the case that the player does not
know whether he is at $x_3$ or $x_4$ in the information set; he can
infer which state he is at from the strategy he is using.

In game theory, a strategy is taken to be a function from information
sets to moves. The intuition behind this is that, since an agent cannot
tell the nodes in an information set apart, he must do the same thing
at all these nodes. But this example shows that if the agent has
imperfect recall but can switch strategies, then he can arrange to do
different things at different nodes in the same information set. As
Halpern [1997] observes, ‘“situations that [an agent] cannot
distinguish” and “nodes in the same information set” may be two quite
different notions.’ He suggests using the game tree to describe the
structure of the game, and using the runs and systems framework [Fagin
et al., 1995] to describe the agent’s information. The idea is that an
agent has an internal local state that describes all the information
that he has. A strategy (or protocol in the language of [Fagin et al.,
1995]) is a function from local states to actions. Protocols capture
the intuition that what an agent does can depend only on what he knows.
But now an agent’s knowledge is represented by its local state, not by
an information set. Different assumptions about what agents know (for
example, whether they know their current strategies) are captured by
running the same protocol in different contexts. If the information
sets appropriately represent an agent’s knowledge in a game, then we
can identify local states with information sets. But, as the example
above shows, we cannot do this in general.

A number of solution concepts have been considered in the game-theory
literature, ranging from Nash equilibrium and correlated equilibrium to
refinements of Nash equilibrium such as sequential equilibrium and
weaker notions such as rationalizability.¹ The fact that game trees
represent both the game and the players’ information has proved
critical in defining solution concepts in extensive-form games. Can we
still represent solution concepts in a useful way using runs and
systems to represent a player’s information? As we show here, not only
can we do this, but we can do it in a way that gives deeper insight
into solution concepts. Indeed, all the standard solution concepts in
the literature can be understood as instances of a single
knowledge-based (kb) program [Fagin et al., 1995; 1997], which captures
the underlying intuition that a player should make a best response,
given her beliefs. The differences between solution concepts arise from
running the kb program in different contexts.

In a kb program, a player’s actions depend explicitly on the player’s
knowledge. For example, a kb program could have a test that says “If
you don’t know that Ann received the information, then send her a
message”, which can be written

if $\neg B_i(\text{Ann received info})$ then send Ann a message.

This kb program has the form of a standard if ... then statement,
except that the test in the if clause is a test on $i$’s knowledge
(expressed using the modal operator $B_i$ for belief; see Section 2 for
a discussion of the use of knowledge vs. belief).

Using such tests for knowledge allows us to abstract away from
low-level details of how the knowledge is obtained. Kb programs have
been applied to a number of problems in the computer science literature
(see [Fagin et al., 1995] and the references therein). We want to apply
kb programs to understand solution concepts. Roughly speaking, we want
a kb program that says that if player $i$ believes that she is about to
do move $a$ (which we express using the formula $\mathit{do}_i(a)$),
and she believes that she would not do any better with another move,
then she should indeed go ahead and do $a$. This test can be viewed as
embodying rationality. There is a subtlety in expressing the statement
“she would not do any better with another move”. We express this by
saying “if her expected utility, given that she does move $a$, is $x$,
then her expected utility if she were to do move $a'$ is at most $x$.”
The “if she were to do $a'$” is a counterfactual statement. She is
planning to do $a$, but is contemplating what would happen if she were
to do something counter to fact, namely, $a'$. Counterfactuals have
been the subject of intense study in the philosophy literature (see,
for example, [Lewis, 1973; Stalnaker, 1968]) and, more recently, in the
game theory literature (see, for example, [Aumann, 1995; Halpern, 2001;
Samet, 1996]). We write the counterfactual “If $A$ were the case then
$B$ would be true” as “$A \mathbin{\Box\!\!\rightarrow} B$”. Although
this statement involves an “if ... then”, the semantics of the
counterfactual implication $A \mathbin{\Box\!\!\rightarrow} B$ is quite
different from the material implication $A \Rightarrow B$. In
particular, while $A \Rightarrow B$ is true if $A$ is false,
$A \mathbin{\Box\!\!\rightarrow} B$ might not be.

¹ We assume that the reader is familiar with standard solution concepts
such as correlated equilibrium, perfect equilibrium, and sequential
equilibrium, as well as the notion of perfect recall in games. The
formal definitions can be found in any standard game theory text, such
as [Osborne and Rubinstein, 1994].

With this background, consider the following kb program for player $i$.
In the program, we use $PM_i$ to denote $i$’s possible moves. For a
normal-form game $\Gamma$, $PM_i$ is $\mathcal{S}_i(\Gamma)$, the set
of pure strategies for player $i$ in $\Gamma$. If $\Gamma$ is an
extensive-form game, then at a history $h$ where $i$ is to move, $PM_i$
consists of all the moves available to $i$ after history $h$.

for each move $a \in PM_i$ do
    if $B_i\bigl(\mathit{do}_i(a) \land \forall x\,((EU_i = x) \Rightarrow \bigwedge_{a' \in PM_i}(\mathit{do}_i(a') \mathbin{\Box\!\!\rightarrow} (EU_i \le x)))\bigr)$ then $a$.

This kb program is meant to capture the intuition above. Intuitively,
it says that player $i$ should do move $a$ if she believes both that
she is about to do $a$ and that, if her expected utility is $x$, then
her expected utility would be no greater than $x$ were she to do any
other move $a'$. Call this kb program $\mathrm{EQ}^\Gamma$ (with the
individual instance for player $i$ denoted by $\mathrm{EQ}_i^\Gamma$).²
As we show, if all players follow $\mathrm{EQ}^\Gamma$, then they end
up playing some type of equilibrium. Which type of equilibrium they
play depends on the context. We start by considering normal-form games,
where, as we said, $PM_i$ consists of the set of pure strategies for
player $i$. If the players have a common prior on the joint strategies
being used, and this common prior is such that players’ beliefs are
independent of the strategies they use, then they play a Nash
equilibrium. Without this independence assumption, we get a correlated
equilibrium. On the other hand, if players have possibly different
priors on the space of strategies, then this kb program defines
rationalizable strategies [Bernheim, 1984; Pearce, 1984]. Using a
characterization due to Halpern [2006], we can show that if their prior
is described by a nonstandard probability distribution and we ignore
what happens on a set of infinitesimal probability, this kb program
defines a (trembling-hand) perfect equilibrium [Selten, 1975].

² Note that, although the notation does not emphasize it, $PM_i$
depends on $\Gamma$; in the case of an extensive-form game, $PM_i$ also
depends on the current history in the game.
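For a normal-form game with a common prior over joint strategies, the test in the kb program can be checked mechanically. The following is a minimal sketch under stated assumptions, not the formal semantics given later in the paper: the function names, the dictionary encoding of the prior, and the example payoffs (a two-player game of Chicken) are all hypothetical. Player i's belief is modeled by conditioning the prior on her own strategy, and the counterfactual "if she were to do another move" is modeled by replacing her component of each joint strategy while the other players' strategies stay fixed.

```python
from fractions import Fraction


def conditional_eu(prior, utility, i, a, deviation=None):
    """Player i's expected utility conditional on i's own strategy being `a`.

    If `deviation` is given, i's component of every joint strategy is replaced
    by it while the other players' strategies are kept fixed (the counterfactual).
    Returns None when the conditioning event has probability 0.
    """
    mass = sum(p for s, p in prior.items() if s[i] == a)
    if mass == 0:
        return None
    total = Fraction(0)
    for s, p in prior.items():
        if s[i] == a:
            played = list(s)
            if deviation is not None:
                played[i] = deviation
            total += p * utility[tuple(played)][i]
    return total / mass


def passes_eq_test(prior, utility, moves, i, a):
    """True if, given `a`, no alternative move gives i a higher expected utility."""
    x = conditional_eu(prior, utility, i, a)
    if x is None:
        return False
    return all(conditional_eu(prior, utility, i, a, d) <= x for d in moves[i])


# Hypothetical payoffs: joint strategy -> (utility of player 0, utility of player 1).
utility = {("C", "C"): (6, 6), ("C", "D"): (2, 7),
           ("D", "C"): (7, 2), ("D", "D"): (0, 0)}
moves = [("C", "D"), ("C", "D")]

third = Fraction(1, 3)
# A prior in which the players' strategies are correlated.
correlated_prior = {("C", "C"): third, ("C", "D"): third, ("D", "C"): third}
# A prior that puts all its mass on both players choosing C.
point_prior = {("C", "C"): Fraction(1)}

all_pass = all(passes_eq_test(correlated_prior, utility, moves, i, a)
               for i in range(2) for a in moves[i])
```

With these assumed payoffs, every strategy played with positive probability under `correlated_prior` passes the test, whereas under `point_prior` player 0 would do better by deviating from C to D, so the test fails there.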

With extensive-form games, we show that again using a nonstandard
prior, EQ defines both perfect equilibrium and sequential equilibrium
[Kreps and Wilson, 1982]. The difference between them is whether we
interpret $EU_i = x$ as meaning that the exact expected utility of
doing move $a$ is $x$, or just the standard part of the utility is $x$.
(Essentially, this amounts to asking whether $x$ ranges over the
standard or nonstandard real numbers.) However, it is important to note
that for EQ to define a sequential or perfect equilibrium, we need to
assume that information sets do correctly describe an agent’s knowledge
and that the game is one of perfect recall. If we drop this assumption,
we can distinguish between the two equilibria for the game described in
Figure 1.

All these solution concepts are based on expected utility. But we can
also consider solution concepts based on other decision rules. For
example, Boutilier and Hyafil [2004] consider minimax-regret
equilibria, where each player uses a strategy that is a best response
in a minimax-regret sense to the choices of the other players.
Similarly, we can use maximin equilibria [Aghassi and Bertsimas, 2006].
As pointed out by Chu and Halpern [2003], all these decision rules can
be viewed as instances of a generalized notion of expected utility,
where uncertainty is represented by a plausibility measure, a
generalization of a probability measure, utilities are elements of an
arbitrary partially ordered space, and plausibilities and utilities are
combined using $\oplus$ and $\otimes$, generalizations of $+$ and
$\times$. We show in the full paper that, just by interpreting
“$EU_i = u$” appropriately, we can capture these more exotic solution
concepts as well. Moreover, we can capture solution concepts in games
where the game itself is not common knowledge, or where agents are not
aware of all moves available, as discussed by Halpern and Rego [2006].

Our approach thus provides a powerful tool for representing solution
concepts, which works even if (a) information sets do not capture an
agent’s knowledge, (b) uncertainty is not represented by probability,
or (c) the underlying game is not common knowledge.

The rest of this paper is organized as follows. In Section 2, we review
the relevant background on game theory and knowledge-based programs. In
Section 3, we show that the normal-form and extensive-form versions of
$\mathrm{EQ}^\Gamma$ characterize Nash equilibrium, correlated
equilibrium, rationalizability, and sequential equilibrium in a game
$\Gamma$ in the appropriate contexts. We conclude in Section 4 with a
discussion of how our results compare to other characterizations of
solution concepts.

2  Background 

In this section, we review the relevant background on games and
knowledge-based programs. We describe only what we need for proving our
results. The reader is encouraged to consult [Osborne and Rubinstein,
1994] for more on game theory, [Fagin et al., 1995; 1997] for more on
knowledge-based programs without counterfactuals, and [Halpern and
Moses, 2004] for more on adding counterfactuals to knowledge-based
programs.


2.1  Games  and  Strategies 

A game in extensive form is described by a game tree. Associated with
each non-leaf node or history is either a player (the player whose move
it is at that node) or nature (which can make a randomized move). The
nodes where a player $i$ moves are further partitioned into information
sets. With each run or maximal history $h$ in the game tree and player
$i$ we can associate $i$’s utility, denoted $u_i(h)$, if that run is
played. A strategy for player $i$ is a (possibly randomized) function
from $i$’s information sets to moves. Thus a strategy for player $i$
tells player $i$ what to do at each node in the game tree where $i$ is
supposed to move. Intuitively, at all the nodes that player $i$ cannot
tell apart, player $i$ must do the same thing. A joint strategy
$\vec{S} = (S_1, \ldots, S_n)$ for the players determines a
distribution over paths in the game tree. A normal-form game can be
viewed as a special case of an extensive-form game where each player
makes only one move, and all players move simultaneously.

2.2  Protocols,  Systems,  and  Contexts 

To explain kb programs, we must first describe standard protocols. We
assume that, at any given point in time, a player in a game is in some
local state. The local state could include the history of the game up
to this point, the strategy being used by the player, and perhaps some
other features of the player’s type, such as beliefs about the
strategies being used by other players. A global state is a tuple
consisting of a local state for each player.

A protocol for player $i$ is a function from player $i$’s local states
to actions. For ease of exposition, we consider only deterministic
protocols, although it is relatively straightforward to model
randomized protocols (corresponding to mixed strategies) as functions
from local states to distributions over actions. Although we restrict
to deterministic protocols, we deal with mixed strategies by
considering distributions over pure strategies.

A run is a sequence of global states; formally, a run is a function
from times to global states. Thus, $r(m)$ is the global state in run
$r$ at time $m$. A point is a pair $(r, m)$ consisting of a run $r$ and
time $m$. Let $r_i(m)$ be $i$’s local state at the point $(r, m)$; that
is, if $r(m) = (s_1, \ldots, s_n)$, then $r_i(m) = s_i$. A joint
protocol is an assignment of a protocol for each player; essentially, a
joint protocol is a joint strategy. At each point, a joint protocol
$\vec{P}$ performs a joint action
$(P_1(r_1(m)), \ldots, P_n(r_n(m)))$, which changes the global state.
Thus, given an initial global state, a joint protocol $\vec{P}$
generates a (unique) run, which can be thought of as an execution of
$\vec{P}$. The runs in a normal-form game involve only one round and
two time steps: time 0 (the initial state) and time 1, after the joint
strategy has been executed. (We assume that the payoff is then
represented in the player’s local state at time 1.) In an
extensive-form game, a run is again characterized by the strategies
used, but now the length of the run depends on the path of play.
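The way a joint protocol generates a run in a normal-form game can be sketched as follows. This is only an illustration under assumed encodings: local states are tuples, a protocol is an ordinary Python function from a local state to an action, and the names `generate_run`, `follow_type`, and the payoff table are hypothetical rather than taken from the paper. The local state at time 1 records the payoff, as the text assumes.

```python
def generate_run(initial_state, joint_protocol, payoff):
    """Return the run [r(0), r(1)] generated from `initial_state`.

    Each player's protocol maps her local state to an action; the resulting
    joint action fixes the payoffs, which are recorded in the local states
    at time 1.
    """
    joint_action = tuple(protocol(local)
                         for protocol, local in zip(joint_protocol, initial_state))
    utilities = payoff[joint_action]
    final_state = tuple((local, action, u)
                        for local, action, u in zip(initial_state, joint_action, utilities))
    return [initial_state, final_state]


def follow_type(local_state):
    """Hypothetical protocol: play the strategy that indexes the local state."""
    return local_state[1]


# Hypothetical payoff table: joint action -> (utility of player 1, utility of player 2).
payoff = {("C", "C"): (6, 6), ("C", "D"): (2, 7),
          ("D", "C"): (7, 2), ("D", "D"): (0, 0)}

initial = (("type", "C"), ("type", "D"))
run = generate_run(initial, (follow_type, follow_type), payoff)
```

Here `run[0]` is the initial global state and `run[1]` is the global state after the joint action; because the protocols are deterministic, the run is uniquely determined by the initial global state.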

A probabilistic system is a tuple
$\mathcal{PS} = (\mathcal{R}, \vec{\mu})$, where $\mathcal{R}$ is a set
of runs and $\vec{\mu} = (\mu_1, \ldots, \mu_n)$ associates a
probability $\mu_i$ on the runs of $\mathcal{R}$ with each player $i$.
Intuitively, $\mu_i$ represents player $i$’s prior beliefs. In the
special case where $\mu_1 = \cdots = \mu_n = \mu$, the players have a
common prior $\mu$ on $\mathcal{R}$. In this case, we write just
$(\mathcal{R}, \mu)$.

We are interested in the system corresponding to a joint protocol
$\vec{P}$. To determine this system, we need to describe the setting in
which $\vec{P}$ is being executed. For our purposes, this setting can
be modeled by a set $\mathcal{G}$ of global states, a subset
$\mathcal{G}_0$ of $\mathcal{G}$ that describes the possible initial
global states, a set $A_s$ of possible joint actions at each global
state $s$, and $n$ probability measures on $\mathcal{G}_0$, one for
each player. Thus, a probabilistic context is a tuple
$\gamma = (\mathcal{G}, \mathcal{G}_0, \{A_s : s \in \mathcal{G}\}, \vec{\mu})$.³
A joint protocol $\vec{P}$ is appropriate for such a context $\gamma$
if, for every global state $s$, the joint actions that $\vec{P}$ can
generate are in $A_s$. When $\vec{P}$ is appropriate for $\gamma$, we
abuse notation slightly and refer to $\gamma$ by specifying only the
pair $(\mathcal{G}_0, \vec{\mu})$. A protocol $\vec{P}$ and a context
$\gamma$ for which $\vec{P}$ is appropriate generate a system; the
system depends on the initial states and probability measures in
$\gamma$. Since these are all that matter, we typically simplify the
description of a context by omitting the set $\mathcal{G}$ of global
states and the sets $A_s$ of global actions. Let
$\mathbf{R}(\vec{P}, \gamma)$ denote the system generated by joint
protocol $\vec{P}$ in context $\gamma$. If
$\gamma = (\mathcal{G}_0, \vec{\mu})$, then
$\mathbf{R}(\vec{P}, \gamma) = (\mathcal{R}, \vec{\mu}')$, where
$\mathcal{R}$ consists of a run $r_s$ for each initial state
$s \in \mathcal{G}_0$, where $r_s$ is the run generated by $\vec{P}$
when started in state $s$, and $\mu'_i(r_s) = \mu_i(s)$, for
$i = 1, \ldots, n$.

A probabilistic system $(\mathcal{R}, \vec{\mu}')$ is compatible with a
context $\gamma = (\mathcal{G}_0, \vec{\mu})$ if (a) every initial
state in $\mathcal{G}_0$ is the initial state of some run in
$\mathcal{R}$, (b) every run is the run of some protocol appropriate
for $\gamma$, and (c) if $\mathcal{R}(s)$ is the set of runs in
$\mathcal{R}$ with initial global state $s$, then
$\mu'_j(\mathcal{R}(s)) = \mu_j(s)$, for $j = 1, \ldots, n$. Clearly
$\mathbf{R}(\vec{P}, \gamma)$ is compatible with $\gamma$.

We can think of the context as describing background information. In
distributed-systems applications, the context also typically includes
information about message delivery. For example, it may determine
whether all messages sent are received in one round, or whether they
may take up to, say, five rounds. Moreover, when this is not obvious,
the context specifies how actions transform the global state; for
example, it describes what happens if in the same joint action two
players attempt to modify the same memory cell. Since such issues do
not arise in the games we consider, we ignore these facets of contexts
here. For simplicity, we consider only contexts where each initial
state corresponds to a particular joint strategy of $\Gamma$. That is,
$\Sigma_i^\Gamma$ is a set of local states for player $i$ indexed by
(pure) strategies. The set $\Sigma_i^\Gamma$ can be viewed as
describing $i$’s types; the state $s_S$ can be thought of as the
initial state where player $i$’s type is such that he plays $S$
(although we stress that this is only intuition; player $i$ does not
have to play $S$ at the state $s_S$). Let
$\mathcal{G}_0^\Gamma = \Sigma_1^\Gamma \times \cdots \times \Sigma_n^\Gamma$.
We will be interested in contexts where the set of initial global
states is a subset $\mathcal{G}_0$ of $\mathcal{G}_0^\Gamma$. In a
normal-form game, the only move possible for player $i$ at an initial
global state is that of choosing a pure strategy, so the joint actions
are joint strategies; no actions are possible at later times. For an
extensive-form game, the possible moves are described by the game tree.
We say that a context for an extensive-form game is standard if the
local states have the form $(s, I)$, where $s$ is the initial state and
$I$ is the current information set. In a standard context, an agent’s
knowledge is indeed described by the information set. However, we do
not require a context to be standard. For example, if an agent is
allowed to switch strategies, then the local state could include the
history of strategies used. In such a context, the agent in the game of
Figure 1 would know more than just what is in the information set, and
would want to switch strategies.

³ We are implicitly assuming that the global state that results from
performing a joint action in $A_s$ at the global state $s$ is unique
and obvious; otherwise, such information would also appear in the
context, as in the general framework of [Fagin et al., 1995].

2.3  Knowledge-Based  Programs 

A knowledge-based program is a syntactic object. For our purposes, we
can take a knowledge-based program for player $i$ to have the form

if $\kappa_1$ then $a_1$
if $\kappa_2$ then $a_2$
...,

where each $\kappa_j$ is a Boolean combination of formulas of the form
$B_i\varphi$, in which the $\varphi$’s can have nested occurrences of
$B_\ell$ operators and counterfactual implications. We assume that the
tests $\kappa_1, \kappa_2, \ldots$ are mutually exclusive and
exhaustive, so that exactly one will evaluate to true in any given
instance. The normal-form version of $\mathrm{EQ}_i^\Gamma$ can be
written in this form by simply replacing the for ... do statement by
one line for each pure strategy in $\mathcal{S}_i(\Gamma)$; similarly
for the extensive-form version.

We want to associate a protocol with a kb program. Unfortunately, we
cannot “execute” a kb program as we can a protocol. How the kb program
executes depends on the outcome of tests $\kappa_j$. Since the tests
involve beliefs and counterfactuals, we need to interpret them with
respect to a system. The idea is that a kb program $\mathsf{Pg}_i$ for
player $i$ and a probabilistic system $\mathcal{PS}$ together determine
a protocol $P$ for player $i$. Rather than giving the general
definitions (which can be found in [Halpern and Moses, 2004]), we just
show how they work in the kb program that we consider in this paper:
EQ.

Given a system $\mathcal{PS} = (\mathcal{R}, \vec{\mu})$, we associate
with each formula $\varphi$ a set $[\![\varphi]\!]_{\mathcal{PS}}$ of
points in $\mathcal{PS}$. Intuitively,
$[\![\varphi]\!]_{\mathcal{PS}}$ is the set of points of $\mathcal{PS}$
where the formula $\varphi$ is true. We need a little notation:

•  If $E$ is a set of points in $\mathcal{PS}$, let $\mathcal{R}(E)$
denote the set of runs going through points in $E$; that is,
$\mathcal{R}(E) = \{r : \exists m\,((r, m) \in E)\}$.

•  Let $\mathcal{K}_i(r, m)$ denote the set of points that $i$ cannot
distinguish from $(r, m)$:
$\mathcal{K}_i(r, m) = \{(r', m') : r'_i(m') = r_i(m)\}$. Roughly
speaking, $\mathcal{K}_i(r, m)$ corresponds to $i$’s information set at
the point $(r, m)$.

•  Given a point $(r, m)$ and a player $i$, let $\mu_{i,r,m}$ be the
probability measure that results from conditioning $\mu_i$ on
$\mathcal{K}_i(r, m)$, $i$’s information at $(r, m)$. We cannot
condition on $\mathcal{K}_i(r, m)$ directly: $\mu_i$ is a probability
measure on runs, and $\mathcal{K}_i(r, m)$ is a set of points. So we
actually condition, not on $\mathcal{K}_i(r, m)$, but on
$\mathcal{R}(\mathcal{K}_i(r, m))$, the set of runs going through the
points in $\mathcal{K}_i(r, m)$. Thus,
$\mu_{i,r,m} = \mu_i \mid \mathcal{R}(\mathcal{K}_i(r, m))$. (For the
purposes of this abstract, we do not specify $\mu_{i,r,m}$ if
$\mu_i(\mathcal{R}(\mathcal{K}_i(r, m))) = 0$. It turns out not to be
relevant to our discussion.)
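The three pieces of notation above can be sketched for a small finite system. The encoding is an assumption made for illustration: a system is a dictionary from hypothetical run names to lists of global states, a global state is a tuple of local states, and a prior is a dictionary from run names to exact fractions. The function names are likewise not from the paper.

```python
from fractions import Fraction


def indistinguishable(runs, r, m, i):
    """K_i(r, m): all points where i has the same local state as at (r, m)."""
    target = runs[r][m][i]
    return {(r2, m2)
            for r2, states in runs.items()
            for m2, global_state in enumerate(states)
            if global_state[i] == target}


def runs_through(points):
    """R(E): the runs that pass through at least one point of E."""
    return {r for r, _ in points}


def condition(mu_i, runs, r, m, i):
    """mu_{i,r,m}: mu_i conditioned on R(K_i(r, m)).

    Returns None when R(K_i(r, m)) has measure 0, mirroring the fact that the
    text leaves that case unspecified.
    """
    support = runs_through(indistinguishable(runs, r, m, i))
    mass = sum(mu_i[r2] for r2 in support)
    if mass == 0:
        return None
    return {r2: (mu_i[r2] / mass if r2 in support else Fraction(0)) for r2 in mu_i}


# Hypothetical two-player system with three runs of two time steps each.
runs = {
    "r1": [("C", "C"), ("C:6", "C:6")],
    "r2": [("C", "D"), ("C:2", "D:7")],
    "r3": [("D", "C"), ("D:7", "C:2")],
}
mu_0 = {"r1": Fraction(1, 2), "r2": Fraction(1, 4), "r3": Fraction(1, 4)}

posterior = condition(mu_0, runs, "r1", 0, 0)
```

At time 0 in run `r1`, player 0 cannot distinguish `r1` from `r2`, so the posterior renormalizes the prior over those two runs and assigns probability 0 to `r3`.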

The kb programs we consider in this paper use a limited collection of
formulas. We now can define $[\![\varphi]\!]_{\mathcal{PS}}$ for the
formulas we consider that do not involve counterfactuals.

•  $[\![\mathit{do}_i(a)]\!]_{\mathcal{PS}}$ is the set of points
$(r, m)$ of $\mathcal{PS}$ at which $i$ performs action $a$.

•  Player $i$ believes a formula $\varphi$ at a point $(r, m)$ if the
event corresponding to formula $\varphi$ has probability 1 according to
$\mu_{i,r,m}$. That is, $(r, m) \in [\![B_i\varphi]\!]_{\mathcal{PS}}$
if $\mu_i(\mathcal{R}(\mathcal{K}_i(r, m))) \ne 0$ (so that
conditioning on $\mathcal{K}_i(r, m)$ is defined) and
$\mu_{i,r,m}([\![\varphi]\!]_{\mathcal{PS}} \cap \mathcal{K}_i(r, m)) = 1$.

•  With every run $r$ in the systems we consider, we can associate the
joint (pure) strategy $\vec{S}$ used in $r$.⁴ This pure strategy
determines the history in the game, and thus determines player $i$’s
utility. Thus, we can associate with every point $(r, m)$ player $i$’s
expected utility at $(r, m)$, where the expectation is taken with
respect to the probability $\mu_{i,r,m}$. If $u$ is a real number, then
$[\![EU_i = u]\!]_{\mathcal{PS}}$ is the set of points where player
$i$’s expected utility is $u$; $[\![EU_i \le u]\!]_{\mathcal{PS}}$ is
defined similarly.

•  Assume that $\varphi(x)$ has no occurrences of $\forall$. Then
$[\![\forall x\,\varphi(x)]\!]_{\mathcal{PS}} = \bigcap_{a \in \mathbb{R}} [\![\varphi[x/a]]\!]_{\mathcal{PS}}$,
where $\varphi[x/a]$ is the result of replacing all occurrences of $x$
in $\varphi$ by $a$. That is, $\forall x$ is just universal
quantification over $x$, where $x$ ranges over the reals. This
quantification arises for us when $x$ represents a utility, so that
$\forall x\,\varphi(x)$ is saying that $\varphi$ holds for all choices
of utility.
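The belief and expected-utility clauses above can be illustrated on a finite system once the conditioned measure is in hand. The sketch below assumes that this measure is given as a dictionary from hypothetical run names to probabilities, and that the set of runs through the points where a formula holds (within the player's information) has already been computed; the function names and the example numbers are not from the paper.

```python
from fractions import Fraction


def believes(posterior, event_runs):
    """B_i(phi) at a point: the runs through the points where phi holds
    (and that i considers possible) have conditional probability 1."""
    return sum(p for r, p in posterior.items() if r in event_runs) == 1


def expected_utility(posterior, utility_of_run):
    """Player i's expected utility with respect to the conditioned measure."""
    return sum(p * utility_of_run[r] for r, p in posterior.items())


# Hypothetical conditioned measure for one player at one point, and the
# utility that each run yields for that player.
posterior = {"r1": Fraction(2, 3), "r2": Fraction(1, 3), "r3": Fraction(0)}
utility_of_run = {"r1": 6, "r2": 2, "r3": 7}

eu = expected_utility(posterior, utility_of_run)
```

With the actual expected utility computed this way, a test of the form "for all x, if EU_i = x then ..." only needs to be checked at x equal to that value, since the antecedent is false for every other real number.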

We now give the semantics of formulas involving counterfactuals. Here
we consider only a restricted class of such formulas, those where the
counterfactual only occurs in the form
$\mathit{do}_i(a) \mathbin{\Box\!\!\rightarrow} \varphi$, which should
be read as “if $i$ were to do move $a$, then $\varphi$ would be true”.
Intuitively, $\mathit{do}_i(a) \mathbin{\Box\!\!\rightarrow} \varphi$
is true at a point $(r, m)$ if $\varphi$ holds in the “closest” point
to $(r, m)$ where $\mathit{do}_i(a)$ holds. What this closest point is
depends on whether we consider normal-form games or extensive-form
games. In a normal-form game, $a$ is a strategy. In that case,
$\mathit{do}_i(a) \mathbin{\Box\!\!\rightarrow} \varphi$ is true at
$(r, m)$ if $\varphi$ is true at the point $(r', m)$ where, in run
$r'$, player $i$ uses strategy $a$ and all the other players use the
same strategies as in $r$. In an extensive-form game, $a$ is a move at
an information set. The closest point to $(r, m)$ where
$\mathit{do}_i(a)$ is true (assuming that $a$ is an action that $i$ can
perform in the local state $r_i(m)$) is the point $(r', m)$ where all
players other than player $i$ use the same protocol in $r'$ and $r$,
and $i$’s protocol in $r'$ agrees with $i$’s protocol in $r$ except at
the local state $r_i(m)$, where $i$ does move $a$. Thus, $r'$ is the
run that results from player $i$ making a single deviation (to $a$ at
time $m$) from the protocol she uses in $r$, and all other players use
the same protocol as in $r$. (This can be viewed as an instance of the
general semantics for counterfactuals used in the philosophy literature
[Lewis, 1973; Stalnaker, 1968] where
$\psi \mathbin{\Box\!\!\rightarrow} \varphi$ is taken to be true at a
world $w$ if $\varphi$ is true at all the worlds $w'$ closest to $w$
where $\psi$ is true.) Of course, if $i$ actually does $a$ in run $r$,
then $r' = r$.

⁴ If we allow players to change strategies during a run, then we will
in general have different joint strategies at each point in a run. For
our theorems in the next section, we restrict to contexts where players
do not change strategies.

There is a problem with this approach. There is no guarantee
that, in general, such a closest point $(r', m)$ exists in
the system $\mathcal{PS}$. To deal with this problem, we restrict
attention to a class of systems where this point is guaranteed to
exist. A system $(\mathcal{R}, \vec{\mu})$ is complete with respect to
context $\gamma$ if $\mathcal{R}$ includes every run generated by a
protocol appropriate for context $\gamma$. In complete systems, the
closest point $(r', m)$ is guaranteed to exist. For the remainder of
the paper, we evaluate formulas only with respect to complete systems.
In a complete system $\mathcal{PS}$, we define
$[\![do_i(a) \mathrel{\Box\!\!\to} \varphi]\!]_{\mathcal{PS}}$ to consist
of all the points $(r, m)$ such that the closest point $(r', m)$ to
$(r, m)$ where $i$ does $a$ is in $[\![\varphi]\!]_{\mathcal{PS}}$. We
say that a complete system $(\mathcal{R}', \vec{\mu}')$ extends
$(\mathcal{R}, \vec{\mu})$ if $\mu_j$ and $\mu'_j$ agree on $\mathcal{R}$
(so that $\mu'_j(A) = \mu_j(A)$ for all $A \subseteq \mathcal{R}$)
for $j = 1, \ldots, n$.
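The closest-point construction can be sketched in code. The following is only an illustration under simplifying assumptions that are ours, not the paper's: a protocol is encoded as a dictionary from local states to actions, a point is identified with the joint protocol that generates it, and the helper names `deviate` and `counterfactual_holds` are hypothetical.

```python
from typing import Callable, Dict, Hashable, Tuple

Protocol = Dict[Hashable, str]        # local state -> action (assumed encoding)
JointProtocol = Tuple[Protocol, ...]  # one protocol per player


def deviate(joint: JointProtocol, i: int, local_state: Hashable, a: str) -> JointProtocol:
    """Joint protocol in which player i does `a` at `local_state`;
    all other players, and i's other choices, are unchanged."""
    changed = dict(joint[i])
    changed[local_state] = a
    return joint[:i] + (changed,) + joint[i + 1:]


def counterfactual_holds(joint: JointProtocol, i: int, local_state: Hashable, a: str,
                         phi: Callable[[JointProtocol], bool]) -> bool:
    """Evaluate the counterfactual "if i did a, phi would hold":
    phi is checked at the closest point, i.e. after the single deviation."""
    return phi(deviate(joint, i, local_state, a))


# Two players, each with a single local state "s".
joint = ({"s": "L"}, {"s": "R"})
same_action = lambda jp: jp[0]["s"] == jp[1]["s"]
holds = counterfactual_holds(joint, 0, "s", "R", same_action)
unchanged = deviate(joint, 0, "s", "L") == joint
```

In this toy case `holds` is true because the deviation makes both players choose `R`, and `unchanged` reflects the remark that $r' = r$ when $i$ already does $a$.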

Since each formula $\kappa$ that appears as a test in a kb program
$\mathsf{Pg}_i$ for player $i$ is a Boolean combination of formulas of
the form $B_i\varphi$, it is easy to check that if
$(r, m) \in [\![\kappa]\!]_{\mathcal{PS}}$, then
$\mathcal{K}_i(r, m) \subseteq [\![\kappa]\!]_{\mathcal{PS}}$. In other
words, the truth of $\kappa$ depends only on $i$'s local state.
Moreover, since the tests are mutually exclusive and exhaustive,
exactly one of them holds in each local state. Given a system
$\mathcal{PS}$, we take the protocol $\mathsf{Pg}_i^{\mathcal{PS}}$ to be
such that $\mathsf{Pg}_i^{\mathcal{PS}}(\ell) = a_j$ if, for some point
$(r, m)$ in $\mathcal{PS}$ with $r_i(m) = \ell$, we have
$(r, m) \in [\![\kappa_j]\!]_{\mathcal{PS}}$. Since
$\kappa_1, \kappa_2, \ldots$ are mutually exclusive and exhaustive,
there is exactly one action $a_j$ with this property.
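As a rough illustration of how a kb program induces a protocol once its tests are interpreted in a fixed system, consider the sketch below. The encoding of points as (run name, time) pairs, of tests as Boolean functions on points, and the name `derived_protocol` are all assumptions made for the example.

```python
from typing import Callable, Dict, Hashable, List, Tuple

Point = Tuple[str, int]          # (run name, time); assumed encoding
Test = Callable[[Point], bool]   # truth of a test kappa_j at a point


def derived_protocol(points: List[Point],
                     local_state: Callable[[Point], Hashable],
                     program: List[Tuple[Test, str]]) -> Dict[Hashable, str]:
    """Map each local state to the action a_j whose test kappa_j holds there."""
    protocol: Dict[Hashable, str] = {}
    for pt in points:
        holding = [action for test, action in program if test(pt)]
        if len(holding) != 1:
            raise ValueError(f"tests are not mutually exclusive and exhaustive at {pt}")
        ell = local_state(pt)
        # The same local state must always select the same action.
        if protocol.setdefault(ell, holding[0]) != holding[0]:
            raise ValueError(f"a test depends on more than the local state {ell!r}")
    return protocol


points = [("r1", 0), ("r2", 0), ("r3", 0)]
state_of = {"r1": "A", "r2": "A", "r3": "B"}
local = lambda pt: state_of[pt[0]]
program = [(lambda pt: local(pt) == "A", "left"),
           (lambda pt: local(pt) != "A", "right")]
protocol = derived_protocol(points, local, program)
```

The two error checks mirror the two properties used in the text: exactly one test holds at each point, and the truth of a test depends only on the local state.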

We are mainly interested in protocols that implement a kb
program. Intuitively, a joint protocol $\vec{P}$ implements a kb
program $\vec{\mathsf{Pg}}$ in context $\gamma$ if $\vec{P}$ performs
the same actions as $\vec{\mathsf{Pg}}$ in all runs of $\vec{P}$ that
have positive probability, assuming that the knowledge tests in
$\vec{\mathsf{Pg}}$ are interpreted with respect to the complete system
$\mathcal{PS}$ extending $\mathbf{R}(\vec{P}, \gamma)$. Formally,
a joint protocol $\vec{P}$ (de facto) implements a joint kb program
$\vec{\mathsf{Pg}}$ [Halpern and Moses, 2004] in a context
$\gamma = (\mathcal{G}_0, \vec{\mu})$ if
$P_i(\ell) = \mathsf{Pg}_i^{\mathcal{PS}}(\ell)$ for every local state
$\ell = r_i(m)$ such that $r \in \mathbf{R}(\vec{P}, \gamma)$ and
$\mu_i(r) \neq 0$, where $\mathcal{PS}$ is the complete system
extending $\mathbf{R}(\vec{P}, \gamma)$. We remark that, in general,
there may not be any joint protocols that implement a kb program
in a given context, there may be exactly one, or there may be
more than one (see [Fagin et al., 1995] for examples). This
is somewhat analogous to the fact that there may not be any
equilibrium of a game for some notions of equilibrium, there
may be one, or there may be more than one.
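The de facto implementation condition can be read as a simple check over the runs that have nonzero probability. The sketch below assumes, purely for illustration, that runs are named by strings, that each run lists the local states player $i$ passes through, and that the derived protocol has already been computed; the function name is hypothetical.

```python
from typing import Dict, Hashable, List


def de_facto_implements(protocol_i: Dict[Hashable, str],
                        derived_i: Dict[Hashable, str],
                        local_states: Dict[str, List[Hashable]],
                        prob_i: Dict[str, float]) -> bool:
    """True if protocol_i agrees with the derived protocol at every local
    state that occurs on a run with nonzero probability."""
    for run, states in local_states.items():
        if prob_i[run] == 0:
            continue  # runs of probability zero impose no constraint
        for ell in states:
            if protocol_i[ell] != derived_i[ell]:
                return False
    return True


local_states = {"r1": ["A"], "r2": ["B"]}
prob_i = {"r1": 1.0, "r2": 0.0}
protocol_i = {"A": "left", "B": "left"}
derived_i = {"A": "left", "B": "right"}
ok = de_facto_implements(protocol_i, derived_i, local_states, prob_i)
```

Here the two protocols disagree only at local state `B`, which occurs only on a run of probability zero, so the check still succeeds.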

3  The  Main  Results 

We start by considering games in normal form. Fix a game $\Gamma$
in normal form. Let $P_i^{nf}$ be the protocol that, in initial state
$s_S$, chooses strategy $S$; let
$\vec{P}^{nf} = (P_1^{nf}, \ldots, P_n^{nf})$.
Let $STRAT_i$ be the random variable on initial global states
that associates with an initial global state $s$ player $i$'s
strategy in $s$. As we said, Nash equilibrium arises in contexts
with a common prior. Suppose that $\gamma = (\mathcal{G}_0, \mu)$ is a
context with a common prior. We say that $\mu$ is compatible with
the mixed joint strategy $\vec{S}$ if $\mu$ is the probability on pure
joint strategies induced by $\vec{S}$.

If $\vec{S}$ is a joint mixed strategy, then it determines a unique
probability measure $\mu_{\vec{S}}$ on pure joint strategies; note that
$STRAT_1, \ldots, STRAT_n$ are independent with respect to
$\mu_{\vec{S}}$. Conversely, if $STRAT_1, \ldots, STRAT_n$ are
independent with respect to $\mu$, then $\mu$ determines a unique
mixed joint strategy $\vec{S}_\mu$.

Theorem 3.1: If the joint strategy $\vec{S}$ is a Nash equilibrium of
the game $\Gamma$, then $\vec{P}^{nf}$ implements $EQ^{NF}_\Gamma$ in
the context $(\mathcal{G}_0^\Gamma, \mu_{\vec{S}})$. Conversely, if $\mu$
is a common prior probability measure on $\mathcal{G}_0^\Gamma$
such that $STRAT_1, \ldots, STRAT_n$ are independent with respect
to $\mu$ and $\vec{P}^{nf}$ implements $EQ^{NF}_\Gamma$ in the context
$(\mathcal{G}_0^\Gamma, \mu)$, then $\vec{S}_\mu$ is a Nash equilibrium.

Proof: Suppose that $\vec{S}$ is a (possibly mixed strategy) Nash
equilibrium of the game $\Gamma$. To see that $\vec{P}^{nf}$ implements
$EQ^{NF}_\Gamma$ in the context
$\gamma = (\mathcal{G}_0^\Gamma, \mu_{\vec{S}})$, let $\ell = r_i(0)$ be
a local state such that $r \in \mathbf{R}(\vec{P}^{nf}, \gamma)$ and
$\mu_{\vec{S}}(r) \neq 0$. If $\ell = s_T$, then $P_i^{nf}(\ell) = T$, so
$T$ must be in the support of $S_i$. Thus, $T$ must be a best response
to $\vec{S}_{-i}$, the joint strategy where each player $j \neq i$
plays its component of $\vec{S}$. Since $i$ uses strategy $T$ in $r$,
the formula $B_i(do_i(T'))$ holds at $(r, 0)$ iff $T' = T$. Moreover,
since $T$ is a best response, if $u$ is $i$'s expected utility with
the joint strategy $\vec{S}$, then for all $T'$, the formula
$do_i(T') \mathrel{\Box\!\!\to} (EU_i \le u)$ holds at $(r, 0)$. Thus,
$(EQ^{NF}_\Gamma)_i^{\mathcal{PS}}(\ell) = T$, where $\mathcal{PS}$ is
the complete system extending $\mathbf{R}(\vec{P}^{nf}, \gamma)$. It
follows that $\vec{P}^{nf}$ implements $EQ^{NF}_\Gamma$.

For the converse, suppose that $\mu$ is a common prior probability
measure on $\mathcal{G}_0^\Gamma$, $STRAT_1, \ldots, STRAT_n$ are
independent with respect to $\mu$, and $\vec{P}^{nf}$ implements
$EQ^{NF}_\Gamma$ in the context $\gamma = (\mathcal{G}_0^\Gamma, \mu)$.
We want to show that $\vec{S}_\mu$ is a Nash equilibrium. It suffices
to show that each pure strategy $T$ in the support of $(\vec{S}_\mu)_i$
is a best response to $(\vec{S}_\mu)_{-i}$. Since $\mu$ is compatible
with $\vec{S}_\mu$, there must be a run $r$ such that $\mu(r) > 0$ and
$r_i(0) = s_T$ (i.e., player $i$ chooses $T$ in run $r$). Since
$\vec{P}^{nf}$ implements $EQ^{NF}_\Gamma$, and in the context $\gamma$,
$EQ^{NF}_\Gamma$ ensures that no deviation from $T$ can improve $i$'s
expected utility with respect to $(\vec{S}_\mu)_{-i}$, it follows that
$T$ is indeed a best response. ∎
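The key step in both directions of the proof is that a mixed joint strategy is a Nash equilibrium exactly when every pure strategy in each player's support is a best response to the others. A minimal sketch of that check follows; the payoff encoding, the function names, and the matching-pennies example are our own illustrative choices, not taken from the paper.

```python
from fractions import Fraction
from itertools import product


def expected_utility(u, mixed, i, pure=None):
    """Expected utility of player i under `mixed`; if `pure` is given,
    player i plays that pure strategy (by index) instead of mixing."""
    total = Fraction(0)
    for profile in product(*(range(len(m)) for m in mixed)):
        if pure is not None and profile[i] != pure:
            continue
        weight = Fraction(1)
        for j, s in enumerate(profile):
            if j == i and pure is not None:
                continue  # player i's choice is fixed, not random
            weight *= mixed[j][s]
        total += weight * u[profile][i]
    return total


def is_nash(u, mixed):
    """Every pure strategy played with positive probability is a best response."""
    for i, m in enumerate(mixed):
        values = [expected_utility(u, mixed, i, pure=t) for t in range(len(m))]
        best = max(values)
        if any(p > 0 and v < best for p, v in zip(m, values)):
            return False
    return True


# Matching pennies (illustrative): player 0 wants to match, player 1 to mismatch.
u = {(0, 0): (1, -1), (0, 1): (-1, 1), (1, 0): (-1, 1), (1, 1): (1, -1)}
half = Fraction(1, 2)
uniform = [[half, half], [half, half]]
pure_profile = [[1, 0], [1, 0]]
```

With these inputs `is_nash(u, uniform)` holds, while `is_nash(u, pure_profile)` fails because player 1 would gain by switching.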

As  is  well  known,  players  can  sometimes  achieve  better 
outcomes  than  a  Nash  equilibrium  if  they  have  access  to  a 
helpful  mediator.  Consider  the  simple  2-player  game  de¬ 
scribed  in  Figure  2,  where  Alice,  the  row  player,  must  choose 
between top and bottom ($T$ and $B$), while Bob, the column
player, must choose between left and right ($L$ and $R$):

[Payoff matrix with columns L and R; only the entry (3,3) is legible in this copy.]

Figure 2: A simple 2-player game.

It is not hard to check that the best Nash equilibrium for
this game has Alice randomizing between $T$ and $B$, and Bob
randomizing between $L$ and $R$; this gives each of them expected
utility 2. They can do better with a trusted mediator,
who makes a recommendation by choosing at random between
$(T, L)$, $(T, R)$, and $(B, L)$. This gives each of them
expected utility 8/3. This is a correlated equilibrium since,
for example, if the mediator chooses $(T, L)$, and thus sends
recommendation $T$ to Alice and $L$ to Bob, then Alice considers
it equally likely that Bob was told $L$ and $R$ and thus
has no incentive to deviate; similarly, Bob has no incentive to
deviate. In general, a distribution $\mu$ over pure joint strategies
is a correlated equilibrium if players cannot do better than
following a mediator's recommendation if a mediator makes
recommendations according to $\mu$. (Note that, as in our example,
if a mediator chooses a joint strategy $(S_1, \ldots, S_n)$
according to $\mu$, the mediator recommends $S_i$ to player $i$; player
$i$ is not told the joint strategy.) We omit the formal definition
of correlated equilibrium (due to Aumann [1974]) here; however,
we stress that a correlated equilibrium is a distribution
over (pure) joint strategies. We can easily capture correlated
equilibrium using $EQ^{NF}_\Gamma$.
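The mediator example can be checked numerically. The payoff matrix in the sketch below is an assumption: the figure's entries are not fully legible here, so we use a hypothetical matrix chosen to reproduce the utilities quoted in the text (2 for the mixed Nash equilibrium, 8/3 with the mediator). The function names are likewise ours.

```python
from fractions import Fraction

# ASSUMED payoffs (hypothetical), indexed by (Alice's move, Bob's move).
U = {("T", "L"): (3, 3), ("T", "R"): (1, 4),
     ("B", "L"): (4, 1), ("B", "R"): (0, 0)}
ACTIONS = (("T", "B"), ("L", "R"))


def expected_utilities(mu):
    """Expected utility of each player under the distribution mu."""
    return tuple(sum(p * U[prof][i] for prof, p in mu.items()) for i in range(2))


def is_correlated_equilibrium(mu):
    """No player gains by deviating from any recommendation."""
    for i in range(2):
        for rec in ACTIONS[i]:
            for dev in ACTIONS[i]:
                gain = Fraction(0)
                for prof, p in mu.items():
                    if prof[i] != rec:
                        continue
                    alt = list(prof)
                    alt[i] = dev
                    gain += p * (U[tuple(alt)][i] - U[prof][i])
                if gain > 0:
                    return False
    return True


third = Fraction(1, 3)
mediator = {("T", "L"): third, ("T", "R"): third, ("B", "L"): third}
point_mass = {("T", "L"): Fraction(1)}
```

Under the assumed payoffs, the uniform mediator distribution passes the check and yields 8/3 for each player, whereas putting all mass on $(T, L)$ does not, since Alice would deviate to $B$.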

Theorem 3.2: The distribution $\mu$ on joint strategies is a
correlated equilibrium of the game $\Gamma$ iff $\vec{P}^{nf}$
implements $EQ^{NF}_\Gamma$ in the context $(\mathcal{G}_0^\Gamma, \mu)$.

Thus, if $\vec{P}^{nf}$ implements $EQ^{NF}_\Gamma$ in context
$(\mathcal{G}_0^\Gamma, \mu)$ and $STRAT_1, \ldots, STRAT_n$ are
independent with respect to $\mu$, then the joint strategy $\vec{S}$
with which $\mu$ is compatible is a Nash equilibrium; if
$STRAT_1, \ldots, STRAT_n$ are not independent with respect to $\mu$,
then $\mu$ is still a correlated equilibrium.

Both Nash equilibrium and correlated equilibrium require
a common prior on runs. By dropping this assumption, we get
another standard solution concept: rationalizability [Bernheim,
1984; Pearce, 1984]. Intuitively, a strategy for player
$i$ is rationalizable if it is a best response to some beliefs that
player $i$ may have about the strategies that other players are
following, assuming that these strategies are themselves best
responses to beliefs that the other players have about strategies
that other players are following, and so on. To make
this precise, we need a little notation. Let
$\mathcal{S}_{-i} = \prod_{j \neq i} \mathcal{S}_j$.
Let $u_i(\vec{S})$ denote player $i$'s utility if the strategy tuple
$\vec{S}$ is played. We describe player $i$'s beliefs about what
strategies the other players are using by a probability $\mu_i$ on
$\mathcal{S}_{-i}(\Gamma)$. A strategy $S$ for player $i$ is a best
response to beliefs described by a probability $\mu_i$ on
$\mathcal{S}_{-i}(\Gamma)$ if
$\sum_{T \in \mathcal{S}_{-i}(\Gamma)} u_i(S, T)\mu_i(T) \ge \sum_{T \in \mathcal{S}_{-i}(\Gamma)} u_i(S', T)\mu_i(T)$
for all $S' \in \mathcal{S}_i(\Gamma)$. Following Osborne
and Rubinstein [1994], we say that a strategy $S$ for player $i$
in game $\Gamma$ is rationalizable if, for each player $j$, there is a
set $Z_j \subseteq \mathcal{S}_j(\Gamma)$ and, for each strategy
$T \in Z_j$, a probability measure $\mu_{j,T}$ on
$\mathcal{S}_{-j}(\Gamma)$ whose support is $Z_{-j}$ such that

• $S \in Z_i$; and

• for each player $j$ and strategy $T \in Z_j$, $T$ is a best
response to the beliefs $\mu_{j,T}$.
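The two conditions above can be checked mechanically once candidate sets $Z_j$ and beliefs $\mu_{j,T}$ are given. The sketch below is restricted to two players and reads the support condition as containment in $Z_{-j}$ (a weaker reading than equality); both restrictions, the data encoding, and the function names are assumptions made for illustration.

```python
def is_best_response(u_i, S, beliefs, own_strategies):
    """S maximizes expected utility against `beliefs`, a dict mapping the
    opponents' strategy tuple T to its probability."""
    def value(strategy):
        return sum(p * u_i(strategy, T) for T, p in beliefs.items())
    return all(value(S) >= value(alt) for alt in own_strategies)


def check_witness(utils, strategies, Z, mu):
    """Two-player check: every T in Z[j] is a best response to mu[j][T],
    and the support of mu[j][T] lies inside Z[1 - j]."""
    for j in (0, 1):
        allowed = {(s,) for s in Z[1 - j]}
        for T in Z[j]:
            support = {t for t, p in mu[j][T].items() if p > 0}
            if not support <= allowed:
                return False
            if not is_best_response(utils[j], T, mu[j][T], strategies[j]):
                return False
    return True


# Matching pennies (illustrative): player 0 wants to match, player 1 to mismatch.
strategies = [("H", "T"), ("H", "T")]
utils = [lambda S, T: 1 if S == T[0] else -1,
         lambda S, T: -1 if S == T[0] else 1]
Z = [{"H", "T"}, {"H", "T"}]
good = [{"H": {("H",): 1}, "T": {("T",): 1}},
        {"H": {("T",): 1}, "T": {("H",): 1}}]
bad = [{"H": {("T",): 1}, "T": {("T",): 1}},
       {"H": {("T",): 1}, "T": {("H",): 1}}]
```

With the beliefs in `good`, every strategy in matching pennies is justified, so `check_witness` succeeds; in `bad`, player 0's strategy `H` is paired with a belief against which it is not a best response.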

For ease of exposition, we consider only pure rationalizable
strategies. This is essentially without loss of generality.
It is easy to see that a mixed strategy $S$ for player $i$ is a best
response to some beliefs $\mu_i$ of player $i$ iff each pure strategy
in the support of $S$ is a best response to $\mu_i$. Moreover, we
can assume without loss of generality that the support of $\mu_i$
consists of only pure joint strategies.


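Theorem 3.3: A pure strategy $S$ for player $i$ is rationalizable
iff there exist probability measures $\mu_1, \ldots, \mu_n$, a set
$\mathcal{G}_0 \subseteq \mathcal{G}_0^\Gamma$, and a state
$s \in \mathcal{G}_0$ such that $P_i^{nf}(s_i) = S$ and $\vec{P}^{nf}$
implements $EQ^{NF}_\Gamma$ in the context $(\mathcal{G}_0, \vec{\mu})$.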

Proof: First, suppose that $\vec{P}^{nf}$ implements $EQ^{NF}_\Gamma$ in
context $(\mathcal{G}_0, \vec{\mu})$. We show that for each state
$s \in \mathcal{G}_0$ and player $i$, the strategy
$S_{s,i} = P_i^{nf}(s_i)$ is rationalizable. Let
$Z_i = \{S_{s,i} : s \in \mathcal{G}_0\}$. For $S \in Z_i$, let
$E(S) = \{s \in \mathcal{G}_0 : s_i = s_S\}$; that is, $E(S)$ consists
of all initial global states where player $i$'s local state is $s_S$;
let $\mu_{i,S} = \mu_i(\cdot \mid E(S))$ (under the obvious
identification of global states in $\mathcal{G}_0$ with joint
strategies). Since $\vec{P}^{nf}$ implements $EQ^{NF}_\Gamma$, it easily
follows that $S$ is a best response to $\mu_{i,S}$. Hence, all the
strategies in $Z_i$ are rationalizable, as desired.

For the converse, let $Z_i$ consist of all the pure rationalizable
strategies for player $i$. It follows from the definition of
rationalizability that, for each strategy $S \in Z_i$, there exists
a probability measure $\mu_{i,S}$ on $Z_{-i}$ such that $S$ is a best
response to $\mu_{i,S}$. For a set $Z$ of strategies, we denote by
$\tilde{Z}$ the set $\{s_T : T \in Z\}$. Set
$\mathcal{G}_0 = \tilde{Z}_1 \times \cdots \times \tilde{Z}_n$, and
choose some measure $\mu_i$ on $\mathcal{G}_0$ such that
$\mu_i(\cdot \mid E(S)) = \mu_{i,S}$ for all $S \in Z_i$. (We can take
$\mu_i = \sum_{S \in Z_i} \alpha_S \mu_{i,S}$, where
$\alpha_S \in (0, 1)$ and $\sum_{S \in Z_i} \alpha_S = 1$.) Recall that
$P_i^{nf}(s_S) = S$ for all states $s_S$. It immediately follows that,
for every rationalizable joint strategy
$\vec{S} = (S_1, \ldots, S_n)$, both
$s = (s_{S_1}, \ldots, s_{S_n}) \in \mathcal{G}_0$ and
$\vec{S} = \vec{P}^{nf}(s)$. Since the states in $\mathcal{G}_0$ all
correspond to rationalizable strategies, and by definition of
rationalizability each (individual) strategy $S_i$ is a best response
to $\mu_{i,S_i}$, it is easy to check that $\vec{P}^{nf}$ implements
$EQ^{NF}_\Gamma$ in the context $(\mathcal{G}_0, \vec{\mu})$, as
desired. ∎
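The parenthetical construction of $\mu_i$ as a weighted mixture can be sanity-checked on a small example. In the sketch below a state is encoded as a pair (player $i$'s own strategy, the other players' strategies); this encoding, the function names, and the numbers are assumptions chosen only to illustrate that conditioning on $E(S)$ recovers $\mu_{i,S}$.

```python
from fractions import Fraction


def mix(alpha, conditional):
    """mu_i(S, T) = alpha[S] * conditional[S][T]: a measure on pairs
    (own strategy S, opponents' strategies T)."""
    return {(S, T): alpha[S] * p
            for S, beliefs in conditional.items()
            for T, p in beliefs.items()}


def condition_on_own(mu, S):
    """mu(. | E(S)), where E(S) is the set of states in which player i's
    own strategy is S."""
    mass = sum(p for (own, _), p in mu.items() if own == S)
    return {T: p / mass for (own, T), p in mu.items() if own == S}


alpha = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
conditional = {"H": {"h": Fraction(1, 4), "t": Fraction(3, 4)},
               "T": {"h": Fraction(1)}}
mu_i = mix(alpha, conditional)
```

Because each $\alpha_S$ is strictly positive, every event $E(S)$ has positive mass, so the conditional measures are well defined and coincide with the beliefs that were mixed.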

We remark that Osborne and Rubinstein's definition of
rationalizability allows $\mu_{j,T}$ to be such that $j$ believes that
other players' strategy choices are correlated. In most of the
literature, players are assumed to believe that other players'
choices are made independently. If we add that requirement,
then we must impose the same requirement on the probability
measures $\mu_1, \ldots, \mu_n$ in Theorem 3.3.

A number of refinements of Nash equilibrium have been
considered in normal-form games. Here we show that this approach
can capture perhaps the best-known one, (trembling-hand)
perfect equilibrium [Selten, 1975]. Our result depends
on a recent characterization of perfect equilibrium [Halpern,
2006] that uses nonstandard probabilities, which can assign
infinitesimal probabilities to initial states (i.e., joint
strategies). This characterization says that $\sigma$ is a perfect
equilibrium if there exists a joint strategy $\sigma'$ consisting of
completely mixed strategies that use nonstandard probability (so that
$\sigma'_i$ assigns positive, although possibly infinitesimal,
probability to each action at every information set) such that
$\sigma'_i$ differs infinitesimally from $\sigma_i$ and $\sigma_i$ is a
best response to $\sigma'_{-i}$. By assuming that every joint strategy
gets positive (although possibly infinitesimal) probability, we can
capture Selten's intuition for trembling-hand equilibrium without
considering sequences of strategy profiles, as Selten does.

It is well known that for every (finite) nonstandard real number $r$,
there is a closest standard real number, denoted $st(r)$ and read
"the standard part of $r$": $|r - st(r)|$ is an infinitesimal. Given a
nonstandard probability measure $\nu$, we can define the standard
probability measure $st(\nu)$ by taking $st(\nu)(w) = st(\nu(w))$.
When dealing with nonstandard probabilities, we generalize
the definition of implementation by requiring only that $\vec{P}$
performs the same actions as $\vec{\mathsf{Pg}}^{\mathcal{PS}}$ in all
runs $r$ of $\vec{P}$ such that $st(\nu)(r) > 0$. (Note that this does
not change the definition of implementation when dealing with standard
probabilities.) If $STRAT_1, \ldots, STRAT_n$ are independent with
respect to $\nu$, then $\nu$ determines a unique (standard) joint mixed
strategy $\vec{S}_{st(\nu)}$. However, given a standard joint strategy
$\vec{S}$, there may be a number of nonstandard probability measures
$\nu$ such that $\vec{S} = \vec{S}_{st(\nu)}$. Moreover, even if
$\vec{S} = \vec{S}_{st(\nu)}$, it does not necessarily follow
that $STRAT_1, \ldots, STRAT_n$ are independent with respect to
$\nu$.$^5$
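To make the role of the standard part concrete, here is a small sketch that models only a fragment of the nonstandard reals: numbers of the form $a + b\epsilon$ for a fixed positive infinitesimal $\epsilon$, compared lexicographically. This first-order representation (no products, no higher powers of $\epsilon$), the class name `NS`, and the example measure are all assumptions made for illustration.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True, order=True)
class NS:
    """The number std + inf * eps, with eps a positive infinitesimal.
    The field order (std, inf) makes the generated comparison lexicographic."""
    std: Fraction
    inf: Fraction = Fraction(0)

    def __add__(self, other: "NS") -> "NS":
        return NS(self.std + other.std, self.inf + other.inf)


def st(x: NS) -> Fraction:
    """Standard part: drop the infinitesimal component."""
    return x.std


def st_measure(nu):
    """Pointwise standard part of a nonstandard measure."""
    return {w: st(p) for w, p in nu.items()}


# A measure giving 1 - eps to state s1 and eps to state s2.
nu = {"s1": NS(Fraction(1), Fraction(-1)), "s2": NS(Fraction(0), Fraction(1))}
zero = NS(Fraction(0))
```

Here `nu["s2"]` is strictly positive, yet its standard part is 0, so a run through `s2` matters for best-response comparisons but is ignored by the generalized definition of implementation.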

Theorem 3.4: If the joint strategy $\vec{S}$ is a perfect equilibrium
of the game $\Gamma$, then there exists a nonstandard probability
measure $\nu$ that gives positive probability to all initial
states such that $STRAT_1, \ldots, STRAT_n$ are independent
with respect to $\nu$, $\vec{S} = \vec{S}_{st(\nu)}$, and $\vec{P}^{nf}$
implements $EQ^{NF}_\Gamma$ in $(\mathcal{G}_0^\Gamma, \nu)$. Conversely,
if $\nu$ is a common prior probability measure on
$\mathcal{G}_0^\Gamma$ that gives positive probability to all initial
states, $STRAT_1, \ldots, STRAT_n$ are independent with respect to
$\nu$, and $\vec{P}^{nf}$ implements $EQ^{NF}_\Gamma$ in the context
$(\mathcal{G}_0^\Gamma, \nu)$, then $\vec{S}_{st(\nu)}$ is a perfect
equilibrium.

This is again very similar in spirit to Theorem 3.1. The key
difference is the use of a nonstandard probability measure.
Intuitively, this forces $\vec{S}$ to be a best response even in the
presence of "trembles".

We now consider extensive-form games. Here, the form of
the local state and how it changes over time becomes more
significant. We focus for now on perhaps the best-known
solution concepts for extensive-form games, perfect equilibrium
and sequential equilibrium [Kreps and Wilson, 1982].
Both of these solution concepts apply only to games of perfect
recall. In these games, it is the players who have perfect
recall. To capture this, we work in standard contexts. Thus, a
local state now has the form $(s_S, I)$, where $S$ is a pure strategy
and $I$ is an information set. That means, intuitively, that
in an information set $I$, a player will know that the information
set is $I$, and will also know his strategy, or, more accurately,
the strategy that he is supposed to be using (since that
is encoded in the initial state). $EQ^{EF}_\Gamma$ characterizes
perfect equilibrium and sequential equilibrium in extensive-form games
of perfect recall, provided we restrict to standard contexts.
Let $P_i^{ef}$ be the protocol that, in a state $(s_S, I)$, does the
move $S(I)$.

We can characterize perfect equilibrium in extensive-form
games of perfect recall the same way we did in normal-form
games; we simply replace $\vec{P}^{nf}$ in Theorem 3.4 with
$\vec{P}^{ef}$. However, as we said, we do need to assume that
contexts are standard.

5 They are "almost independent" in the sense that the probability
of $i$ choosing strategy $S$ and $j$ choosing strategy $S'$ differs only
infinitesimally from the product of the probability that $i$ chooses $S$
and the probability that $j$ chooses $S'$.


Theorem 3.5: If the joint strategy $\vec{S}$ is a perfect equilibrium
of a game $\Gamma$ of perfect recall in extensive form,
then there exists a nonstandard probability measure $\nu$ that
gives positive probability to all initial states such that
$STRAT_1, \ldots, STRAT_n$ are independent with respect to $\nu$,
$\vec{S} = \vec{S}_{st(\nu)}$, and $\vec{P}^{ef}$ implements
$EQ^{EF}_\Gamma$ in the standard context $(\mathcal{G}_0^\Gamma, \nu)$.
Conversely, if $\nu$ is a common prior probability measure on
$\mathcal{G}_0^\Gamma$ that gives positive probability to all initial
states, $STRAT_1, \ldots, STRAT_n$ are independent with respect to
$\nu$, and $\vec{P}^{ef}$ implements $EQ^{EF}_\Gamma$ in the standard
context $(\mathcal{G}_0^\Gamma, \nu)$, then $\vec{S}_{st(\nu)}$ is a
perfect equilibrium.

We next characterize sequential equilibrium in terms of
$EQ^{EF}_\Gamma$. We again depend on Halpern's [2006] characterization
of sequential equilibrium using nonstandard probability. The
only difference between sequential equilibrium and perfect
equilibrium in this characterization is that with perfect equilibrium
$\sigma_i$ must be a best response to $\sigma'_{-i}$, while with
sequential equilibrium, it must just be an $\epsilon$-best response,
for some infinitesimal $\epsilon$. To capture this difference, when
dealing with sequential equilibrium, the expression "$EU_i = x$" in
$EQ^{EF}_\Gamma$ is interpreted as "the standard part of $i$'s expected
utility is $x$". That is, when dealing with perfect equilibrium, $x$
ranges over the nonstandard reals; when dealing with sequential
equilibrium, $x$ ranges over the standard reals. The effect of
interpreting "$EU_i = x$" as "the standard part of $i$'s expected
utility is $x$" is that we ignore infinitesimal differences. Thus, for
example, the move $S(I)$ made by a strategy $S$ at an information
set $I$ might not be a best response to the distribution of moves
made by the remaining players at $I$; it may just be an
$\epsilon$-best response for some infinitesimal $\epsilon$.

Theorem 3.6: If $\Gamma$ is an extensive-form game with perfect
recall and there is a belief system $\beta$ such that
$(\vec{S}, \beta)$ is a sequential equilibrium of $\Gamma$, then there
exists a nonstandard measure $\nu$ on $\mathcal{G}_0^\Gamma$ compatible
with $\vec{S}$ that gives positive (although possibly infinitesimal)
probability to all initial states such that
$STRAT_1, \ldots, STRAT_n$ are independent with respect to $\nu$ and
$\vec{P}^{ef}$ implements $EQ^{EF}_\Gamma$ in the standard context
$(\mathcal{G}_0^\Gamma, \nu)$. Conversely, if $\nu$ is a common prior
probability measure on $\mathcal{G}_0^\Gamma$ that gives positive
probability to all initial states, $STRAT_1, \ldots, STRAT_n$ are
independent with respect to $\nu$, and $\vec{P}^{ef}$ de facto
implements $EQ^{EF}_\Gamma$ in the context
$(\mathcal{G}_0^\Gamma, \nu)$, then there is a belief system $\beta$
such that $(\vec{S}, \beta)$ is a sequential equilibrium, where
$\vec{S}$ is the unique joint strategy compatible with $\nu$.

4  Conclusions 

We have shown how a number of different solution concepts
from game theory can be captured by essentially one
knowledge-based program, which comes in two variants: one
appropriate for normal-form games and one for extensive-form
games. The differences between these solution concepts
are captured by changes in the context in which the games are
played: whether players have a common prior (for Nash equilibrium,
correlated equilibrium, perfect equilibrium, and sequential
equilibrium) or not (for rationalizability); whether
strategies are chosen independently (for Nash equilibrium,
perfect equilibrium, sequential equilibrium, and rationalizability)
or not (for correlated equilibrium); and whether
uncertainty is represented using a standard or nonstandard
probability measure.

Our results can be viewed as showing that each of these solution
concepts $sc$ can be characterized in terms of common
knowledge of rationality (since the kb programs $EQ^{NF}_\Gamma$ and
$EQ^{EF}_\Gamma$ embody rationality, and we are interested in systems
"generated" by these programs, so that rationality holds at all
states), and common knowledge of some other features $X_{sc}$
captured by the context appropriate for $sc$ (e.g., that strategies
are chosen independently or that the prior is common). Roughly
speaking, our results say that if $X_{sc}$ is common knowledge
in a system, then common knowledge of rationality implies
that the strategies used must satisfy solution concept $sc$;
conversely, if a joint strategy $\vec{S}$ satisfies $sc$, then there is
a system where $X_{sc}$ is common knowledge, rationality is common
knowledge, and $\vec{S}$ is being played at some state. Results
similar in spirit have been proved for rationalizability
[Brandenburger and Dekel, 1987] and correlated equilibrium [Aumann,
1987]. Our approach allows us to unify and extend these
results and, as suggested in the introduction, applies even to
settings where the game is not common knowledge, to settings
where uncertainty is not represented by probability, and
(in the case of extensive-form games) to settings where the game is
not one of perfect recall.

Indeed, consider the game of Figure 1 again. It is not hard
to show that $P^{ef}$ implements $f$ in the standard context that
gives probability 1 to the state where the player plays $f$. In
this context, $f'$ is not a strategy, since the player must make
the same move at both nodes in the information set. However,
suppose we change the set of states so that the player can keep
track, in his local state, of the strategy he is currently using.
When using the strategy of playing $B$ at both $x_1$ and $x_2$, but
switching from $f$ to $f'$ at $x_2$, his local state at $x_3$ would be
$(f, \{x_3, x_4\})$, while his local state at $x_4$ would be
$(f', \{x_3, x_4\})$; that is, he has different local states at $x_3$
and $x_4$. Thus, even though $x_3$ and $x_4$ are supposed to be in the
same information set, the player can distinguish these nodes. (This
observation was originally made in [Halpern, 1997].) Let $g$ be the
strategy of switching from $f$ to $f'$ at $x_2$. It is not hard to show
that $P^{ef}$ implements $g$ in the (nonstandard) context that allows
local states where the agent keeps track of strategy changes
and where the state where the player plays $g$ gets probability
1. (This discussion is basically a reformulation of the points
made by Halpern [1997] in the framework of this paper.)

As  this  example  shows,  as  long  as  we  use  the  appropri¬ 
ate  context,  whether  or  not  we  have  perfect  recall,  this  ap¬ 
proach  gives  the  “right”  answer.  We  believe  that  the  approach 
captures  the  essence  of  the  intuition  that  a  solution  concept 
should  embody  common  knowledge  of  rationality. 